- Title
- Common code segment selection: semi-automated approach and evaluation
- Creator
- Karnalim, Oscar; Simon
- Relation
- 52nd ACM Technical Symposium on Computer Science Education. SIGCSE '21: Proceedings of the 52nd ACM Technical Symposium on Computer Science Education (Online 13-20 March, 2021) p. 335-341
- Publisher Link
- http://dx.doi.org/10.1145/3408877.3432436
- Publisher
- Association for Computing Machinery (ACM)
- Resource Type
- conference paper
- Date
- 2021
- Description
- When comparing student programs to check for evidence of plagiarism or collusion, the goal is to identify code segments that are common to two or more programs. Yet some code segments are common for reasons other than plagiarism or collusion, and so should not be considered. A few code similarity detection tools automatically remove very common segments, but they are prone to false results as no human validation is involved. This paper proposes a semi-automated approach for excluding common segments, where human validation is introduced before excluding the segments. As existing selection techniques are not detachable from their similarity detection tools, we propose a new tool to independently select the segments (C2S2), along with several adjustable selection constraints to keep the number of suggested segments reasonable for manual observation. In order to independently evaluate automated selection techniques, we propose and apply three metrics. The evaluation shows our selection technique to be more effective and efficient than the basis underlying existing selection techniques, and establishes the benefit of each of its selection features.
- Subject
- common code segment; semi-automated approach; n-gram; code similarity; plagiarism; collusion
- Identifier
- http://hdl.handle.net/1959.13/1463829
- Identifier
- uon:46848
- Identifier
- ISBN:9781450380621
- Language
- eng